This document gives a compact overview of the current results that lead to the questions we now face.
So far we have only worked with the bird ider (as it had the longest time series). We decided on the following preprocessing:
We aggregated each 30-second interval into one data point. The time series we examine now includes:
To obtain these, we first computed step length and turning angle at 1 Hz within each interval. For step length, we simply used the mean as the summary statistic.
For the turning angle, we observed that soaring behaviour stands out through a more or less constant (slightly fluctuating) turning angle within the 30-second interval. However, the turning direction is sometimes right and sometimes left, which is not of interest for detecting this behaviour. For the turning angle we therefore computed the absolute value of the mean in each time interval. This value is then scaled by a factor of \(1/\pi\) so that it lies between 0 and 1 and can be modelled with the beta distribution.
For the height, we likewise computed the first difference within each interval and then took the mean.
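The three summary statistics described above can be sketched as follows. This is a minimal illustration in Python, not the code used for the analysis. It assumes projected horizontal coordinates in metres and one fix per second, and it takes "mean" to be the arithmetic mean of the wrapped turning angles (rather than a circular mean). The function name `summarise_interval` and the synthetic circular track are hypothetical.

```python
import math

def summarise_interval(x, y, z):
    """Summarise one interval of 1 Hz fixes.

    x, y: projected horizontal coordinates in metres (assumed input format)
    z:    height in metres
    Returns (mean step length, scaled |mean turning angle|, mean height difference).
    """
    # Step lengths and headings between consecutive fixes
    dx = [b - a for a, b in zip(x, x[1:])]
    dy = [b - a for a, b in zip(y, y[1:])]
    steps = [math.hypot(u, v) for u, v in zip(dx, dy)]
    headings = [math.atan2(v, u) for u, v in zip(dx, dy)]

    # Turning angles: heading changes wrapped to [-pi, pi)
    turns = [(h2 - h1 + math.pi) % (2 * math.pi) - math.pi
             for h1, h2 in zip(headings, headings[1:])]

    mean_step = sum(steps) / len(steps)
    # Absolute value of the arithmetic mean, scaled by 1/pi to lie in [0, 1]
    turn_scaled = abs(sum(turns) / len(turns)) / math.pi
    # Mean first difference of the height (m/s at 1 Hz)
    dz = [b - a for a, b in zip(z, z[1:])]
    mean_dz = sum(dz) / len(dz)
    return mean_step, turn_scaled, mean_dz

# Synthetic soaring-like interval: circling with a constant turn of 0.3 rad/s
theta, radius = 0.3, 10.0
ticks = range(30)
x_circ = [radius * math.cos(theta * k) for k in ticks]
y_circ = [radius * math.sin(theta * k) for k in ticks]
z_up = [1.0 * k for k in ticks]  # climbing at 1 m/s

left = summarise_interval(x_circ, y_circ, z_up)
# Mirrored track: same circle flown in the opposite direction
right = summarise_interval(x_circ, [-v for v in y_circ], z_up)
```

Because the absolute value is taken after averaging, the left-hand and right-hand circles give the same scaled turning angle, whereas an interval that alternates between left and right turns would average out towards zero.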
We had to exclude some outliers, especially for the height first difference, namely data points with an average height first difference above 20 m/s or below -110 m/s. This already leads to the first question: Is this reasonable for the animal?
Afterwards the three time series looked like this:
One can observe three patterns: large step lengths in combination with small turning angles and mostly downward movement; medium step lengths in combination with turning angles around 0.15 and upward movement; and barely any step length in combination with varying turning angles and basically no vertical movement.
This made a 3-state HMM look reasonable, with states:
We formulated this model without covariates and fitted it to the first 5000 observations (which correspond to a one-way trans-Himalayan migration). This results in the following transition probabilities and stationary distribution:
Transition probability matrix:
##            [,1]       [,2]      [,3]
## [1,] 0.86800355 0.01962751 0.1123689
## [2,] 0.07021677 0.31375198 0.6160312
## [3,] 0.25451582 0.19366838 0.5518158
Stationary state distribution:
## [1] 0.61118746 0.09921031 0.28960223
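As a cross-check, the stationary distribution can be recomputed from the printed transition probability matrix. The sketch below is in Python and uses repeated multiplication (power iteration) instead of solving the linear system directly, so it is an illustration and not necessarily how the reported values were obtained. The function name `stationary_distribution` is hypothetical.

```python
def stationary_distribution(tpm, n_iter=10_000, tol=1e-12):
    """Approximate delta satisfying delta = delta * tpm by repeated multiplication."""
    n = len(tpm)
    delta = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(n_iter):
        new = [sum(delta[i] * tpm[i][j] for i in range(n)) for j in range(n)]
        total = sum(new)
        new = [v / total for v in new]  # renormalise; printed rows are rounded
        if max(abs(a - b) for a, b in zip(new, delta)) < tol:
            return new
        delta = new
    return delta

# Transition probability matrix of the 3-state model, copied from the output above
tpm3 = [
    [0.86800355, 0.01962751, 0.1123689],
    [0.07021677, 0.31375198, 0.6160312],
    [0.25451582, 0.19366838, 0.5518158],
]
delta3 = stationary_distribution(tpm3)
```

Up to rounding of the printed matrix, `delta3` should agree with the stationary state distribution reported above.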
Here we can observe a lack of fit: the model overestimates how often step lengths of 2-6 meters occur.
Overall, this model already looks promising. However, the distinction between soaring and gliding is not as clear as we had hoped. This is visible in the marginal plot above as well as in the scatterplot:
The model often also decodes upward movement as state 2, which should reflect only gliding. We therefore looked into four-state models.
First we fitted a four-state HMM without covariates, with the aim of finding a model that better separates states 2 and 3. This yielded the following results:
Transition probability matrix:
##            [,1]       [,2]       [,3]       [,4]
## [1,] 0.79509498 0.02349971 0.07368018 0.10772513
## [2,] 0.09183038 0.27718824 0.54369743 0.08728395
## [3,] 0.20992394 0.17018923 0.43522963 0.18465720
## [4,] 0.22848433 0.03012788 0.11230147 0.62908632
Stationary state distribution:
## [1] 0.49752575 0.06943857 0.18172651 0.25130917
The fourth state fixes the overestimation of step lengths between 2 and 6 meters, so it seems to be necessary. This points to a fourth behaviour, "active resting", that the 3-state model could not capture appropriately. More on this later.
Here states 1 to 3 have basically the same characteristics as in the three-state HMM. We interpreted the new fourth state as another kind of resting behaviour, but one that is distinctly more active than the resting behaviour captured in state 1: the step length is longer, there is less variability in the turning angles, and there is more vertical movement. Introducing this fourth state leads to a clearer distinction between soaring and gliding.
Having got this far, we started to introduce covariates into the state-switching process by expressing the transition probabilities as functions of those covariates. We started with a dependency on temperature.
This model separates states 2 and 3 even better. Overall, however, the decoded states have not changed much.
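To make "transition probabilities as functions of covariates" concrete, the sketch below assumes the common multinomial logit parametrisation, in which each off-diagonal transition has a linear predictor (intercept plus slope times temperature) and the diagonal serves as the reference. Whether the fitted model uses exactly this link is an assumption. The Python function `transition_matrix` and all coefficient values are hypothetical and are not fitted estimates.

```python
import math

def transition_matrix(coef, temp):
    """Build a transition probability matrix for a given temperature.

    coef[i][j] = (intercept, slope) for i != j; diagonal entries are ignored
    because staying in the current state is the reference category.
    """
    n = len(coef)
    tpm = []
    for i in range(n):
        eta = []
        for j in range(n):
            if i == j:
                eta.append(0.0)  # reference: linear predictor fixed at 0
            else:
                b0, b1 = coef[i][j]
                eta.append(b0 + b1 * temp)
        m = max(eta)  # subtract the maximum for numerical stability
        w = [math.exp(e - m) for e in eta]
        s = sum(w)
        tpm.append([v / s for v in w])  # each row sums to 1
    return tpm

# Hypothetical (intercept, temperature slope) pairs; NOT fitted values
coef = [
    [None, (-2.0, 0.05), (-1.5, -0.02)],
    [(-1.0, 0.00), None, (0.5, 0.03)],
    [(-1.2, -0.04), (-0.8, 0.01), None],
]
tpm_cold = transition_matrix(coef, temp=-10.0)
tpm_warm = transition_matrix(coef, temp=15.0)
```

With this construction every row is a valid probability distribution for any temperature, which is why the link is convenient for numerical maximisation of the likelihood.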
We can now look at the hypothetical stationary state distribution as well as the transition probabilities depending on the external temperature:
As the transition probabilities depend on the temperature, we cannot plot the component distributions against the histogram using the stationary distribution. However, in this case we can replace the stationary distribution with the relative frequencies of the decoded states. In this way we still obtain the following plots:
Here in particular we can see that this model better separates downward and upward movement, looking at the orange and green distributions for states 2 and 3 respectively.
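The weighting by relative state frequencies can be sketched as follows. This Python example takes a decoded state sequence, computes the share of each state, and uses these shares to weight state-dependent beta densities for the scaled turning angle. The toy sequence, the shape parameters and the function names are hypothetical and serve only to illustrate the idea.

```python
import math
from collections import Counter

def state_weights(decoded, n_states):
    """Relative frequency of each state (labelled 1..n_states) in the decoded sequence."""
    counts = Counter(decoded)
    return [counts.get(s, 0) / len(decoded) for s in range(1, n_states + 1)]

def beta_pdf(x, a, b):
    """Density of the beta distribution with shape parameters a and b, for 0 < x < 1."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * x ** (a - 1) * (1 - x) ** (b - 1)

def weighted_densities(x, weights, shapes):
    """Weighted component densities at x; their sum is the marginal density."""
    return [w * beta_pdf(x, a, b) for w, (a, b) in zip(weights, shapes)]

decoded = [1, 1, 2, 3, 3, 3, 4, 4]  # toy decoded sequence, not real output
weights = state_weights(decoded, n_states=4)
# Hypothetical beta shape parameters per state; NOT fitted values
shapes = [(1.0, 1.0), (1.5, 20.0), (6.0, 30.0), (2.0, 8.0)]
curves = weighted_densities(0.15, weights, shapes)
marginal = sum(curves)
```

Evaluating `weighted_densities` on a grid of turning angles gives one curve per state, which can be drawn over the histogram in the same way as with stationary weights.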
The next model we fitted used the covariate Time of day to explain the transitions.
The latest model uses both temperature and time of day as covariates. So far, this is the best model in terms of BIC.
We also fitted a model using the categorical variable landform. These were the results:
We can also have a look at the hypothetical stationary distribution and transition probabilities as functions of the landform:
(C = Cliff, LS = Lower slope, MD = Mountain/divide, PR = Peak/ridge, US = Upper slope, V = Valley)
In addition to the large HMM with landform as a covariate, we fitted a multinomial regression model to the decoded states. In theory this is not actually the right approach, as the data points are not independent; however, the results are easier to interpret. They are shown below. As the call shows, the regression also includes temperature and elevation. State 2 is chosen as the reference, and the coefficients for Cliff are in the Intercept column.
## # weights: 36 (24 variable)
## initial value 6931.471806
## iter 10 value 6119.894878
## iter 20 value 5828.840183
## iter 30 value 5824.275496
## iter 40 value 5824.201595
## final value 5824.201499
## converged
## Call:
## multinom(formula = states ~ l_ + temp + elevation, data = data5)
##
## Coefficients:
##   (Intercept) l_Lower slope l_Mountain/divide l_Peak/ridge l_Upper slope
## 1   11.246272     -8.697109         -27.07686    -9.139732     -9.740135
## 3   14.371103    -11.344661         -11.16987   -11.324433    -11.376805
## 4   -7.229348      9.353725         -11.36170    10.431981      9.745529
##     l_Valley        temp     elevation
## 1  -9.577025 -0.01423847 -2.035722e-04
## 3 -11.501040 -0.07693129 -3.762851e-05
## 4   8.781534 -0.04119853 -3.106104e-04
##
## Std. Errors:
##   (Intercept) l_Lower slope l_Mountain/divide l_Peak/ridge l_Upper slope
## 1 0.004290278    0.05931676      3.128141e-11  0.005905521    0.06205703
## 3 0.003851639    0.06817014      5.329467e-04  0.004015567    0.06696330
## 4 0.005188757    0.06858732      1.444550e-12  0.009533563    0.06514085
##     l_Valley        temp    elevation
## 1 0.05027957 0.001926316 4.156599e-05
## 3 0.05801797 0.002241908 4.118436e-05
## 4 0.05546861 0.002215903 4.734610e-05
##
## Residual Deviance: 11648.4
## AIC: 11696.4
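To read the coefficients on the probability scale, they can be turned into predicted state probabilities. The Python sketch below copies the coefficients from the output above and applies the multinomial logit formula with state 2 as the reference (linear predictor 0) and Cliff as the reference landform. The function name `state_probabilities` and the covariate values in the example are hypothetical, and temperature and elevation are assumed to be given in the same units as in `data5`.

```python
import math

# Coefficients copied from the multinom output above
# (reference state: 2; reference landform: Cliff, absorbed into the intercept)
COEF = {
    1: {"(Intercept)": 11.246272, "Lower slope": -8.697109,
        "Mountain/divide": -27.07686, "Peak/ridge": -9.139732,
        "Upper slope": -9.740135, "Valley": -9.577025,
        "temp": -0.01423847, "elevation": -2.035722e-04},
    3: {"(Intercept)": 14.371103, "Lower slope": -11.344661,
        "Mountain/divide": -11.16987, "Peak/ridge": -11.324433,
        "Upper slope": -11.376805, "Valley": -11.501040,
        "temp": -0.07693129, "elevation": -3.762851e-05},
    4: {"(Intercept)": -7.229348, "Lower slope": 9.353725,
        "Mountain/divide": -11.36170, "Peak/ridge": 10.431981,
        "Upper slope": 9.745529, "Valley": 8.781534,
        "temp": -0.04119853, "elevation": -3.106104e-04},
}

def state_probabilities(landform, temp, elevation):
    """Predicted probabilities of states 1-4 for one covariate combination."""
    eta = {2: 0.0}  # reference state
    for state, c in COEF.items():
        value = c["(Intercept)"] + c["temp"] * temp + c["elevation"] * elevation
        if landform != "Cliff":  # Cliff is the baseline landform
            value += c[landform]
        eta[state] = value
    m = max(eta.values())  # subtract the maximum for numerical stability
    w = {s: math.exp(v - m) for s, v in eta.items()}
    total = sum(w.values())
    return {s: w[s] / total for s in sorted(w)}

# Hypothetical covariate values, chosen only for illustration
probs = state_probabilities("Valley", temp=5.0, elevation=4500.0)
```

Comparing the output across landforms at fixed temperature and elevation shows how the landform shifts the state probabilities, which is usually easier to interpret than the raw log-odds coefficients.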
These results lead to the following questions: